Want Speed? Pass by Value.

This entry is part of a series, RValue References: Moving Forward»

Be honest: how does the following code make you feel?

std::vector<std::string> get_names();
…
std::vector<std::string> const names = get_names();

Frankly, even though I should know better, it makes me nervous. In principle, when get_names() returns, we have to copy a vector of strings. Then, we need to copy it again when we initialize names, and we need to destroy the first copy. If there are N strings in the vector, each copy could require as many as N+1 memory allocations and a whole slew of cache-unfriendly data accesses as the string contents are copied.

Rather than confront that sort of anxiety, I’ve often fallen back on pass-by-reference to avoid needless copies:

get_names(std::vector<std::string>& out_param );
…
std::vector<std::string> names;
get_names( names );

Unfortunately, this approach is far from ideal.

The code grew by 150%
We’ve had to drop const-ness because we’re mutating names.
As functional programmers like to remind us, mutation makes code more complex to reason about by undermining referential transparency and equational reasoning.
We no longer have strict value semantics¹ for names.

But is it really necessary to mess up our code in this way to gain efficiency? Fortunately, the answer turns out to be no (and especially not if you are using C++0x). This article is the first in a series that explores rvalues and their impliciations for efficient value semantics in C++.

RValues

Rvalues are expressions that create anonymous temporary objects. The name rvalue refers to the fact that an rvalue expression of builtin type can only appear on the right-hand side of an assignment. Unlike lvalues, which, when non-const, can always be used on the left-hand-side of an assignment, rvalue expressions yield objects without any persistent identity to assign into.²

The important thing about anonymous temporaries for our purposes, though, is that they can only be used once in an expression. How could you possibly refer to such an object a second time? It doesn’t have a name (thus, “anonymous”); and after the full expression is evaluated, the object is destroyed (thus, “temporary”)!

Once you know you are copying from an rvalue, then, it should be possible to “steal” the expensive-to-copy resources from the source object and use them in the target object without anyone noticing. In this case that would mean transferring ownership of the source vector’s dynamically-allocated array of strings to the target vector. If we could somehow get the compiler to execute that “move” operation for us, it would be cheap–almost free–to initialize names from a vector returned by-value.

That would take care of the second expensive copy, but what about the first? When get_names returns, in principle, it has to copy the function’s return value from the inside of the function to the outside. Well, it turns out that return values have the same property as anonymous temporaries: they are about to be destroyed, and won’t be used again. So, we could eliminate the first expensive copy in the same way, transferring the resources from the return value on the inside of the function to the anonymous temporary seen by the caller.

Copy Elision and the RVO

The reason I kept writing above that copies were made “in principle” is that the compiler is actually allowed to perform some optimizations based on the same principles we’ve just discussed. This class of optimizations is known formally as copy elision. For example, in the Return Value Optimization (RVO), the calling function allocates space for the return value on its stack, and passes the address of that memory to the callee. The callee can then construct a return value directly into that space, which eliminates the need to copy from inside to outside. The copy is simply elided, or “edited out,” by the compiler. So in code like the following, no copies are required:

std::vector<std::string> names = get_names();

Also, although the compiler is normally required to make a copy when a function parameter is passed by value (so modifications to the parameter inside the function can’t affect the caller), it is allowed to elide the copy, and simply use the source object itself, when the source is an rvalue.

std::vector<std::string> 
sorted(std::vector<std::string> names)
{
    std::sort(names);
    return names;
}
 
// names is an lvalue; a copy is required so we don't modify names
std::vector<std::string> sorted_names1 = sorted( names );
 
// get_names() is an rvalue expression; we can omit the copy!
std::vector<std::string> sorted_names2 = sorted( get_names() );

This is pretty remarkable. In principle, in line 12 above, the compiler can eliminate all the worrisome copies, making sorted_names2 the same object as the one created in get_names(). In practice, though, the principle won’t take us quite that far, as I’ll explain later.

Implications

Although copy elision is never required by the standard, recent versions of every compiler I’ve tested do perform these optimizations today. But even if you don’t feel comfortable returning heavyweight objects by value, copy elision should still change the way you write code.

Consider this cousin of our original sorted(…) function, which takes names by const reference and makes an explicit copy:

std::vector<std::string> 
sorted2(std::vector<std::string> const& names) // names passed by reference
{
    std::vector<std::string> r(names);        // and explicitly copied
    std::sort(r);
    return r;
}

Although sorted and sorted2 seem at first to be identical, there could be a huge performance difference if a compiler does copy elision. Even if the actual argument to sorted2 is an rvalue, the source of the copy, names, is an lvalue,³ so the copy can’t be optimized away. In a sense, copy elision is a victim of the separate compilation model: inside the body of sorted2, there’s no information about whether the actual argument to the function is an rvalue; outside, at the call site, there’s no indication that a copy of the argument will eventually be made.

That realization leads us directly to this guideline:

Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.

At worst, if your compiler doesn’t elide copies, performance will be no worse. At best, you’ll see an enormous performance boost.

One place you can apply this guideline immediately is in assignment operators. The canonical, easy-to-write, always-correct, strong-guarantee, copy-and-swap assignment operator is often seen written this way:

T& T::operator=(T const& x) // x is a reference to the source
{ 
    T tmp(x);          // copy construction of tmp does the hard work
    swap(*this, tmp);  // trade our resources for tmp's
    return *this;      // our (old) resources get destroyed with tmp 
}

but in light of copy elision, that formulation is glaringly inefficient! It’s now “obvious” that the correct way to write a copy-and-swap assignment is:

T& operator=(T x)    // x is a copy of the source; hard work already done
{
    swap(*this, x);  // trade our resources for x's
    return *this;    // our (old) resources get destroyed with x
}

Reality Bites

Of course, lunch is never really free, so I have a couple of caveats.

First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality are serious considerations for your application, it could have a real effect.

On the other hand, it’s easy to build a wrapper function that localizes the copy:

std::vector<std::string> 
sorted3(std::vector<std::string> const& names)
{
    // copy is generated once, at the site of this call
    return sorted(names);
}

Since the converse doesn’t hold—you can’t get back a lost opportunity for copy elision by wrapping—I recommend you start by following the guideline, and make changes only as you find them to be necessary.

Second, I’ve yet to find a compiler that will elide the copy when a function parameter is returned, as in our implementation of sorted. When you think about how these elisions are done, it makes sense: without some form of inter-procedural optimization, the caller of sorted can’t know that the argument (and not some other object) will eventually be returned, so the compiler must allocate separate space on the stack for the argument and the return value.

If you need to return a function parameter, you can still get near-optimal performance by swapping into a default-constructed return value (provided default construction and swap are cheap, as they should be):

std::vector<std::string> 
sorted(std::vector<std::string> names)
{
    std::sort(names);
    std::vector<std::string> ret;
    swap(ret, names);
    return ret;
}

More To Come

Hopefully you now have the ammunition you need to stave off anxiety about passing and returning nontrivial objects by value. But we’re not done yet: now that we’ve covered rvalues, copy elision, and the RVO, we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!

Follow this link to the next installment.

Acknowledgements

Howard Hinnant is responsible for key insights that make this article series possible. Andrei Alexandrescu was posting on comp.lang.c++.moderated about how to leverage copy elision years before I took it seriously. Most of all, though, thanks in general to all readers and reviewers!

Googling for a good definition of value semantics turned up nothing for me. Unless someone else can point to one (and maybe even if they can), we’ll be running an article on that topic—in which I promise you a definition—soon. ↩
For a detailed treatment of rvalues and lvalues, please see this excellent article by Dan Saks ↩
Except for enums and non-type template parameters, every value with a name is an lvalue. ↩

Entries in this series:

Posted Saturday, August 15th, 2009 under Value Semantics.

Tags: semantics, value

enum PieceTypes { King, Queen, Rook, Bishop, Knight, Pawn }; class Position; class Move; // C++03 style move generation template<PieceTypes> void generate(const Position&, std::vector<Move>& moves); void generate(const Position& p, std::vector<Move>& moves) { generate<King>(p, moves); generate<Queen>(p, moves); generate<Rook>(p, moves); generate<Bishop>(p, moves); generate<Knight>(p, moves); generate<Pawn>(p, moves); } // call as: Position p; std::vector<Move> moves; generate(p, moves);

// C++11 style move generation // infix operator+ to do append on std::vector as with std::string // append is associative but not commutative template<typename T> std::vector<T> operator+(std::vector<T> lhs, const std::vector<T>& rhs) { lhs.insert(end(lhs), begin(rhs), end(rhs)); return lhs; } template<PieceTypes> std::vector<Move> generate(const Position& p); std::vector<Move> generate(const Position& p); { return ( generate<King>(p) + generate<Queen>(p) + generate<Rook>(p) + generate<Bishop>(p) + generate<Knight>(p) + generate<Pawn>(p) ); } // call as: Position p; auto moves = generate(p);

template<typename T> std::vector<T> operator+(const std::vector<T>& lhs, const std::vector<T>& rhs) { auto tmp = lhs; tmp.insert(tmp.end(), rhs.begin(), rhs.end()); return tmp; } template<typename T> std::vector<T> operator+(const std::vector<T>& lhs, std::vector<T>&& rhs) { rhs.insert(rhs.begin(), lhs.begin(), lhs.end()); return std::move(rhs); } template<typename T> std::vector<T> operator+(std::vector<T>&& lhs, const std::vector<T>& rhs) { lhs.insert(lhs.end(), rhs.begin(), rhs.end()); return std::move(lhs); } template<typename T> std::vector<T> operator+(std::vector<T>&& lhs, std::vector<T>&& rhs) { lhs.insert(lhs.end(), rhs.begin(), rhs.end()); return std::move(lhs); }

std::vector<Move> generate(const Position& p) { std::vector<Move> moves; generate<King>(p, moves); generate<Queen>(p, moves); generate<Rook>(p, moves); generate<Bishop>(p, moves); generate<Knight>(p, moves); generate<Pawn>(p, moves); return moves; }

std::vector<Move> generate(const Position& p) { std::vector<Move> moves; moves.reserve(32); generate<King>(p, moves); .... return moves; }

#include <cstddef> #include <new> #include <utility> template <class T> class MyVector { T* begin_; T* end_; T* capacity_; public: MyVector() : begin_(nullptr), end_(nullptr), capacity_(nullptr) {} ~MyVector() { clear(); ::operator delete(begin_); } MyVector(std::size_t N, const T& t) : MyVector() { if (N > 0) { begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; for (; N > 0; --N, ++end_) ::new(end_) T(t); } } MyVector(const MyVector& v) : MyVector() { std::size_t N = v.size(); if (N > 0) { begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; for (std::size_t i = 0; i < N; ++i, ++end_) ::new(end_) T(v[i]); } } MyVector(MyVector&& v) : begin_(v.begin_), end_(v.end_), capacity_(v.capacity_) { v.begin_ = nullptr; v.end_ = nullptr; v.capacity_ = nullptr; } #ifndef USE_SWAP_ASSIGNMENT MyVector& operator=(const MyVector& v) { if (this != &v) { std::size_t N = v.size(); if (capacity() < N) { clear(); ::operator delete(begin_); begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; } T* p = begin_; const T* q = v.begin_; for (; p < end_ && q < v.end_; ++p, ++q) *p = *q; if (q < v.end_) { for (; q < v.end_; ++q, ++end_) ::new(end_) T(*q); } else { while (end_ > p) { --end_; end_->~T(); } } } return *this; } MyVector& operator=(MyVector&& v) { clear(); swap(v); return *this; } #else MyVector& operator=(MyVector v) { swap(v); return *this; } #endif void clear() { while (end_ > begin_) { --end_; end_->~T(); } } std::size_t size() const {return static_cast<std::size_t>(end_ - begin_);} std::size_t capacity() const {return static_cast<std::size_t>(capacity_ - begin_);} const T& operator[](std::size_t i) const {return begin_[i];} T& operator[](std::size_t i) {return begin_[i];} void swap(MyVector& v) { std::swap(begin_, v.begin_); std::swap(end_, v.end_); std::swap(capacity_, v.capacity_); } }; template <class T> inline void swap(MyVector<T>& x, MyVector<T>& y) { x.swap(y); } #include <iostream> #include <string> #include <chrono> int main() { MyVector<std::string> v1(1000, "1234567890123456789012345678901234567890"); MyVector<std::string> v2(1000, "1234567890123456789012345678901234567890123456789"); typedef std::chrono::high_resolution_clock Clock; typedef std::chrono::duration<double, std::micro> US; auto t0 = Clock::now(); v2 = v1; auto t1 = Clock::now(); std::cout << US(t1-t0).count() << " microseconds\n"; } $ clang++ -stdlib=libc++ -std=c++11 -O3 -DUSE_SWAP_ASSIGNMENT test.cpp $ a.out 174.516 microseconds $ a.out 180.83 microseconds $ a.out 175.848 microseconds $ clang++ -stdlib=libc++ -std=c++11 -O3 test.cpp $ a.out 26.339 microseconds $ a.out 24.179 microseconds $ a.out 24.103 microseconds

#include #include #include #define SIZE 1000 class Big { public: Big() { std::cout << "Big constructor" << std::endl; //res = malloc(SIZE); } ~Big() { std::cout << "Big destructor" << std::endl; //free(res); } static Big Build() { Big big; memset(big.res, 0, SIZE); std::cout << "first " << &big << std::endl; return big; } static Big Revise(Big value) { std::cout << "second " << &value << std::endl; memset(value.res, 1, SIZE); return value; } static Big Revise1(Big const& ref) { Big value(ref); std::cout << "second " << &value << std::endl; memset(value.res, 1, SIZE); return value; } private: char res[SIZE]; }; int _tmain(int argc, _TCHAR* argv[]) { Big b = Big::Revise(Big::Build()); std::cout << "third " << &b << std::endl; std::cout << std::endl; Big c = Big::Revise1(Big::Build()); std::cout << "third1 " << &c << std::endl; return 0; }

X nrvo_source(bool b) { trace t("nrvo_source"); if (b) { X a; return a; } else { X b ; return b ; } } X urvo_source(bool b) { trace t("urvo_source"); if (b) return X() ; else return X() ; }

struct string { const char* str; string(const char* str) : str(str) { puts("constructor"); } string(const string& other) : str(other.str) { puts("copy constructor"); } }; struct holder_1 { string s; holder_1(const string& s) : s(s) {} }; int main() { holder_1("some string"); }

struct T { T() { cout << "+ " << this << endl; } T(T && f) { cout << "+ " << this << " move " << &f << endl; } T(T const& f) =delete;//{ cout << "+ " << this << " copy " << &f << endl; } }; T func (T f) { return T {}; } int main () { T t2 = func(T {}); }

AK says:

January 29, 2014 (1 week ago) at 1:42 am

Great article, but my experimentation with VS2012 shows that references are faster in some circumstances.

I tested a small class that printed when objects of that class were Copy, Move or Default constructed. I then used the following two small tests, similar to your vector sorting functions.

MyClass s;

MyClass refRet = DoByRef(s); MyClass copyRet = DoByCopy(s);

With the implementations being:

MyClass DoByRef(const MyClass &s) { MyClass ret(s); ret.DoSomething(); return ret; }

MyClass DoByCopy(MyClass s) { MyClass ret(s); ret.DoSomething(); return ret; }

The results were that DoByRef does a total of one Copy construction, and DoByCopy does two. Now this could be a result of all the code being in one module, I’m not sure yet, but it does show that at times, passing references are considerably faster.

Quote

AK says:

January 29, 2014 (1 week ago) at 3:03 am

OK looking a little more into it, if I construct ret with a std::move(s) instead, and pass an rvalue to DoByCopy, then I get a total of one move. If I pass an lvalue to it, I get a copy and a move (Slightly slower than just passing by reference which is just one copy). However this only makes passing by value useful if you’re passing in an rvalue. If you pass in an lvalue for some reason, you’re actually slower than just using a reference.

So basically I’m not convinced (yet) that it’s worth the trouble to stop using pass by reference in real code.

AKQuote

David says:

October 30, 2013 at 11:53 pm

Any chance you could fix the broken link at the bottom of the article please?

“Follow this link to the next installment.”

Vitali says:

September 17, 2013 at 4:09 am

Does the same rule apply if your type is largish (let’s say 128 bytes)? To me, it seems like a pass-by-value would be pretty expensive since the swap or rvalue move will still effectively be a copy, thereby causing 2 copies of the data instead of 1. In the case where you are supplying an rvalue, you end up copying the data twice as well since the move into the local variable will be a copy as will the swap. Thus the pass-by-value case will always involve two copies.

Thus to me it seems like for larger types you should still use const&. If the type can be moved more efficiently than copied, than APIs using it should provide an additional && API.

Noah Roberts says:

September 16, 2013 at 1:33 am

I heard about this article when it was cited in Going Native 2013 where they were recommending to use value semantics by default.

I can see why this improves performance when dealing with temporaries and/or things that can be moved. There’s a case I worry about though and I wonder what you have to say on the matter.

My worry is that the assumptions are not documented by the signature. The assumption is that I’m going to gain speed because of copy ellision, but if I change the calling site such that that can no longer be done by instead passing an lvalue then I lose that performance. For example, I my find that I want to use the same object again so I store the temporary into a variable. Now I’m paying the cost of copies when a reference would have served the same purpose.

That in and of itself doesn’t bother me, what bothers me is that this will happen without any warning. The semantic use of the function completely alters the performance, or am I missing something?

What would you recommend to avoid this issue?

Andrie Suak Tiwa says:

May 24, 2013 at 10:06 am

Absolutely Brilliant! Thanks!

i will always pass by value for most situations.

Paul says:

May 27, 2012 at 10:35 am

my brain hurts after reading this. Do things really need to be so complicated in C++?

Dave Abrahams says:

June 4, 2012 at 8:09 pm

Not if you don’t care about performance. The price of performance is dealing with issues closer to the machine model. You can cover those issues up with “pretty” language abstractions like garbage collection… but then you have to give the performance back

Dave AbrahamsQuote
- JR says:
  
  March 29, 2013 at 6:28 pm
  
  I sympathise with Paul’s lament.
  
  If anyone was producing a new high-performance language and they considered the use case of a user procedure to initialise a read-only vector of strings would they come up with such a pig’s ear of a solution as C++ has ended up with? C++’s excuse is C.
  
  Its got nothing to do with garbage collection.
  
  JRQuote
  - Marcel Kincaid says:
    
    April 3, 2013 at 2:12 am
    
    You’re wrong. This has nothing to do with C, and it has everything to do with the fact that C++ doesn’t garbage collect. If you think otherwise, go ahead and try to design your high-performance language.
    
    Marcel KincaidQuote
- Jakob says:
  
  December 7, 2013 at 4:13 am
  
  In java, the above issue doesn’t exist since you have pointers for everything, which is pretty neat in my opinion. Then of course, you have garbage collection instead.
  
  I’ve been thinking if you couldn’t do the same with smart pointers, and still avoid having garbage collection. It’s just that smart pointers are so ugly in C++…
  
  JakobQuote

Return Value Optimisation « Squiggly Brackets says:

March 26, 2012 at 12:34 pm

rhalbersma says:

February 6, 2012 at 9:24 am

In a chess program, moves can be generated and stored in a std::vector. Most programs have split the move generation by piece type, and they pass a std::vector by reference (or pointer) to append the various parts together. Code would typically look like this:

Translating this to C++11 style with std::vector return-by-value runs into the small problem that the standard containers have no infix operators to append containers in the same way as one can do with std::string. Adding a template operator+ that also has a pass-by-value left argument will fix this.

Of course, the use of the operator+ can be debated here. For numeric applications one might also use operator+ to do element-by-element addition.

rhablersma says:

February 9, 2012 at 2:00 pm
OK, I guess I should have asked a question to generate a reply. So here’s a question: how can I make the above code avoid all unnecessary copies? Reading from the later installments of this blog series, I figure I need 4 overloads of operator+. Here’s a first try:
```
template<typename T>
std::vector<T> operator+(const std::vector<T>& lhs, const std::vector<T>& rhs)
{
        auto tmp = lhs;
        tmp.insert(tmp.end(), rhs.begin(), rhs.end());
        return tmp;
}
 
template<typename T>
std::vector<T> operator+(const std::vector<T>& lhs, std::vector<T>&& rhs)
{
        rhs.insert(rhs.begin(), lhs.begin(), lhs.end());
        return std::move(rhs);
}
 
template<typename T>
std::vector<T> operator+(std::vector<T>&& lhs, const std::vector<T>& rhs)
{
        lhs.insert(lhs.end(), rhs.begin(), rhs.end());
        return std::move(lhs);
}
 
template<typename T>
std::vector<T> operator+(std::vector<T>&& lhs, std::vector<T>&& rhs)
{
        lhs.insert(lhs.end(), rhs.begin(), rhs.end());
        return std::move(lhs);
}
```
It’s different from the Matrix example in this blog because operator+ is not commutative, and it’s also different to the std::string example in the Nxxx standard proposal documents because std::vector does not have a built-in operator+=. So another question is: do I also need to have 4 overloads of operator+= to let std::vector have full append functionality? What signature would they have to have?
rhablersmaQuote
- Howard Hinnant says:
  
  February 27, 2012 at 12:04 am
  Your solution is fine as far as move semantics goes. A reasonable rewrite is to replace all 4 of your signatures with just:
  template<typename T> std::vector<T> operator+(std::vector<T> lhs, const std::vector<T>& rhs) { lhs.insert(lhs.end(), rhs.begin(), rhs.end()); return lhs; }
  That being said, I would be tempted to just do the following:
  std::vector<Move> generate(const Position& p) { std::vector<Move> moves; generate<King>(p, moves); generate<Queen>(p, moves); generate<Rook>(p, moves); generate<Bishop>(p, moves); generate<Knight>(p, moves); generate<Pawn>(p, moves); return moves; }
  This isn’t quite as “cute” but is perfectly efficient. And this also comes with a caveat. If in the original code the client is calling this over and over as in:
```
std::vector<Moves> moves;
while (...)
{
    // ...
    moves.clear();
    generate(p, moves);
    // ...
}
```
  Then you might consider leaving your code as is. Count trips to the heap. Whatever minimizes that count is the best solution. Don’t throw away vector capacity to then just allocate it back. If you can reuse capacity, doing so is always a win. If moves is likely to hold capacity prior to the call to generate then attempt to take advantage of it.
  Howard HinnantQuote
  - rhalbersma says:
    
    March 5, 2012 at 7:28 am
    Hi Howard,
    
    Thanks for your comment. I make an upfront reservation for the the move vector’s capacity, so passing the pointer around would minimize the number of heap allocations:
    
    std::vector<Move> generate(const Position& p) { std::vector<Move> moves; moves.reserve(32); generate<King>(p, moves); .... return moves; }
    
    One more question, though: would this still apply if I would use your stack allocator? So with
    
    std::vector<Move, stack_alloc<Move, 32> > moves;
    
    wouldn’t that make the “+” notation more viable?
    rhalbersmaQuote
Kos says:

December 23, 2012 at 4:18 pm

IMHO, as the functions are only concerned with “generating values”, instead of knowing about vectors they should just take an output_iterator as parameter and work with it.

Then you’d just make a back_inserter to your vector, pass it around and you’re clear to go.

KosQuote

Howard Hinnant says:

January 13, 2012 at 6:10 pm

I think this article should be updated for C++11. There are two things wrong with it:

It leaves the impression that one should always write your assignment operator like so:
```
T& operator=(T x)    // x is a copy of the source; hard work already done
{
    swap(*this, x);  // trade our resources for x's
    return *this;    // our (old) resources get destroyed with x
}
```
But in some important cases, this is a large performance penalty. Vector-like classes where heap memory can be reused during the copy assignment is a classic example. I’ve just written a short example showing as high as a 7X performance penalty.

In C++11 the correct way to write sorted is:

std::vector<std::string>
sorted(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());
    return names;
}

Implicit return-by-move from by-value parameters is now required.

The basic point of the article is sound: Passing by value is an important tool in the tool box. But I’ve seen too many references to this article that mistakenly throw design and testing out the window on this issue, and translate this article into “always pass by value”.

Dave Abrahams says:

January 13, 2012 at 7:33 pm

1000% agreed

Dave AbrahamsQuote
- Matthias Vallentin says:
  
  February 14, 2012 at 12:32 am
  In the case of C++11, wouldn’t it make sense to always use the by-value version of the assignment operator if a move constructor is provided in addition?
```
T(T&& x)
{
     // Steal resources from x.
}

T& operator=(T x)    // uses T(T&&) iff x is an r-value
{
    swap(*this, x);
    return *this;
}
```
  Matthias VallentinQuote
  - Howard Hinnant says:
    
    February 26, 2012 at 11:30 pm
    I think you just hammered my first point above home. Trying code:
    
    #include <cstddef> #include <new> #include <utility> template <class T> class MyVector { T* begin_; T* end_; T* capacity_; public: MyVector() : begin_(nullptr), end_(nullptr), capacity_(nullptr) {} ~MyVector() { clear(); ::operator delete(begin_); } MyVector(std::size_t N, const T& t) : MyVector() { if (N > 0) { begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; for (; N > 0; --N, ++end_) ::new(end_) T(t); } } MyVector(const MyVector& v) : MyVector() { std::size_t N = v.size(); if (N > 0) { begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; for (std::size_t i = 0; i < N; ++i, ++end_) ::new(end_) T(v[i]); } } MyVector(MyVector&& v) : begin_(v.begin_), end_(v.end_), capacity_(v.capacity_) { v.begin_ = nullptr; v.end_ = nullptr; v.capacity_ = nullptr; } #ifndef USE_SWAP_ASSIGNMENT MyVector& operator=(const MyVector& v) { if (this != &v) { std::size_t N = v.size(); if (capacity() < N) { clear(); ::operator delete(begin_); begin_ = end_ = static_cast<T*>(::operator new(N*sizeof(T))); capacity_ = begin_ + N; } T* p = begin_; const T* q = v.begin_; for (; p < end_ && q < v.end_; ++p, ++q) *p = *q; if (q < v.end_) { for (; q < v.end_; ++q, ++end_) ::new(end_) T(*q); } else { while (end_ > p) { --end_; end_->~T(); } } } return *this; } MyVector& operator=(MyVector&& v) { clear(); swap(v); return *this; } #else MyVector& operator=(MyVector v) { swap(v); return *this; } #endif void clear() { while (end_ > begin_) { --end_; end_->~T(); } } std::size_t size() const {return static_cast<std::size_t>(end_ - begin_);} std::size_t capacity() const {return static_cast<std::size_t>(capacity_ - begin_);} const T& operator[](std::size_t i) const {return begin_[i];} T& operator[](std::size_t i) {return begin_[i];} void swap(MyVector& v) { std::swap(begin_, v.begin_); std::swap(end_, v.end_); std::swap(capacity_, v.capacity_); } }; template <class T> inline void swap(MyVector<T>& x, MyVector<T>& y) { x.swap(y); } #include <iostream> #include <string> #include <chrono> int main() { MyVector<std::string> v1(1000, "1234567890123456789012345678901234567890"); MyVector<std::string> v2(1000, "1234567890123456789012345678901234567890123456789"); typedef std::chrono::high_resolution_clock Clock; typedef std::chrono::duration<double, std::micro> US; auto t0 = Clock::now(); v2 = v1; auto t1 = Clock::now(); std::cout << US(t1-t0).count() << " microseconds\n"; } $ clang++ -stdlib=libc++ -std=c++11 -O3 -DUSE_SWAP_ASSIGNMENT test.cpp $ a.out 174.516 microseconds $ a.out 180.83 microseconds $ a.out 175.848 microseconds $ clang++ -stdlib=libc++ -std=c++11 -O3 test.cpp $ a.out 26.339 microseconds $ a.out 24.179 microseconds $ a.out 24.103 microseconds
    Howard HinnantQuote
    - Matthias Vallentin says:
      
      February 28, 2012 at 6:26 am
      
      Thanks for posting the example, that your point in the previous comment clear.
      
      Matthias
      
      Matthias VallentinQuote
    - Agnostic says:
      
      November 9, 2012 at 11:03 am
      
      But what if exception is thrown while doing *p = *q; in “optimal” operator=. We have a vector with partially copyed items. Isn’t this a case when we are trading safety for speed?
      
      BTW: we have a bug in this vector for the case if (capacity() < N) the end_ pointer equals to begin_ and is not updated by += N below.
      
      AgnosticQuote
    - coder says:
      
      January 9, 2013 at 5:00 am
      
      I replicated this test using gcc-4.7.2 -std=c++11 -O3
      
      compiled with -DUSE_SWAP_ASSIGNMENT 47 microseconds 48 microseconds 48 microseconds 47 microseconds
      
      without -DUSE_SWAP_ASSIGNMENT 36 microseconds 36 microseconds 36 microseconds 36 microseconds
      
      That yields a .32X performance difference. Could not reproduce your 7x. But even .32X is very significant.
      
      coderQuote
  - Andrea says:
    
    March 29, 2012 at 4:16 pm
    
    Wow! Tried to understand what’s going on here… Am I right in assuming the difference is in the ::new(end_) T(v[i]) copy construction loop in cctor (used when -DUSE_SWAP_ASSIGNMENT) vs the *p = *q assignment loop when using “normal” MyVector& operator=(const MyVector &) ??? But then why such a difference? There’s a placement new up there (i.e. not real memory allocations, only string’s cctor invocation), how can it be that worse than an assignment? I’m sure I’m missing something… Also, testing on a mac with latest clang from trunk and gcc-4.7 built from sources, I can’t see that timing difference when compiling with g++ -std=c+11. I.e., both old and swap-based assignments time almost the same as clang’s best case. Is the different size of std::string (8 bytes in gnu’s libstdc++ vs 24 bytes in clang’s libc++) the cause of gcc’s insensitivity to the kind of assignment used?
    
    Andrea
    
    AndreaQuote
    - Howard Hinnant says:
      
      March 30, 2012 at 7:41 pm
      
      Think of it this way: the most efficient way to recycle something is to re-use it. The copy assignment operator can sometimes re-use memory, instead of deallocating it and then allocating more. That is what is happening in this example. One way deallocates memory just to turn around and allocate it back. The other way holds on to its memory and re-uses it for the new value. The optimization is to simply avoid calling new/delete as much as you can.
      
      I imagine the difference you’re seeing with gcc is that they are using a ref-counted string. Try the experiment again, but using MyVector<std::vector<int>> instead.
      
      Howard HinnantQuote
    - Andrea says:
      
      April 2, 2012 at 8:34 pm
      
      Howard, thanks for your helpful reply. I think I’m getting hold of it now. Can you just confirm my understanding is right when I say that: When using vectors of object that have “external” resources (i.e. allocate memory on the heap as it is the case for strings), going through the route of invoking their constructor (as when doing ::new(end_) T(v[i]) in the copy constructor loop used when -DUSE_SWAP_ASSIGNMENT) makes you incur in the penalty of allocations even if those are placement news. Instead, the p=q assignments in the plain old assignment operator’s loop can re-use the already allocated memory on the destination (in particular when as in this case, the destination is a longer string) and make this approach more efficient. As you suggested, using vectors of int (or I think more generally PODs/aggregate objects) levels out the difference of the two approaches because that extra price during the placement new has not to be payed.
      
      Andrea
      
      AndreaQuote
    - Howard Hinnant says:
      
      April 2, 2012 at 8:42 pm
      
      That sounds right.
      
      Howard HinnantQuote

Andrzej Krzemieński says:

November 21, 2011 at 1:21 pm

Hi Dave, I tried to apply the idiom for copy assignment you describe, but I encountered one suspicious nuance when trying to specify exception specification for my copy assignment. I have a “gut feeling” that something is wrong, although I cannot clearly specify it. Here is my problem. Without passing by value I would specify my assigment like this:

T& T::operator=( T const& x )
{ 
  T tmp(x);          // can throw
  swap(*this, tmp);  // no-throw (let's assume)
  return *this; 
}

This really says what I need to do to assign the value from one object represented by reference ‘x’. I need to make a copy first, and swap it with my value. Can this operation throw? Surely: a copy constructor is a typical place where one would expect a throw.

Now, is the answer the same for the “pass-by-value” idiom?

T& operator=(T x)    // this copying is not inside the assignment
{
    swap(*this, x);  // no-throw
    return *this;  
}

Technically we are doing the same thing, but copying is somehow ejected outside of the function. The function only does a no-fail (let’s assume that) swap. So in fact, I can write:

T& operator=(T x) noexcept
{
    swap(*this, x);  // no-throw
    return *this;  
}

I am telling the truth: there is nothing in the copy assignment that could cause a throw. But anyone who tries to use the assignment may throw, beacuse our function even thouh it does not copy itself, forces you to copy T, even though you are not (or may be not) aware of it. That is, by declaring the function like this (with noexcept), while technically being correct, I confuse everyone by implying that using this assignment operator does not raise exceptions. I would be more honest if I wrote:

T& operator=(T x) noexcept( std::is_nothrow_copy_constructible<T>::value )
{
    swap(*this, x);  // no-throw
    return *this;  
}

But this also looks strange: why would I base the condition on the properties of the constructor that I never call?

Dave Abrahams says:

December 28, 2011 at 10:30 pm
Hi Andrzej,

I wish this were crisper, but I would say:
- operator= is unconditionally noexcept
- assignment from an lvalue is noexcept if T’s copy constructor is noexcept
- assignment from an rvalue is noexcept if move-constructing a T is noexcept
Dave AbrahamsQuote

litb says:

August 19, 2011 at 10:00 pm

“Except for enums, every value with a name is an lvalue.”. I know I’m going to annoy you by this, but I want to inform the innocent reader that integer, pointer and member pointer template parameters aren’t lvalues either.

Dave Abrahams says:

August 24, 2011 at 2:36 am

Good point; fixed, thanks!

Dave AbrahamsQuote
Knowing me, knowing you, a-ha says:

October 8, 2011 at 12:23 pm

You forgot about floats. They aren’t lvalues either. And few more other things as well.

Knowing me, knowing you, a-haQuote
- Dave Abrahams says:
  
  October 9, 2011 at 1:14 am
  
  No, a named float most certainly is an lvalue.
  
  Dave AbrahamsQuote
  - Knowing me, knowing you, a-ha says:
    
    October 11, 2011 at 5:59 am
    
    So let me get this straight, named integer isn’t lvalue but named float is?
    
    Knowing me, knowing you, a-haQuote
    - Dave Abrahams says:
      
      October 11, 2011 at 6:13 am
      
      No, unless that integer is a non-type template parameter, they’re both lvalues.
      
      Dave AbrahamsQuote
    - Knowing me, knowing you, a-ha says:
      
      October 11, 2011 at 6:35 am
      
      Yes, he talks about template parameters.
      
      Knowing me, knowing you, a-haQuote
    - Knowing me, knowing you, a-ha says:
      
      October 11, 2011 at 7:45 am
      
      Thanks Dave, missed the part about template params
      
      Knowing me, knowing you, a-haQuote

someguy says:

August 11, 2011 at 9:56 pm

With sorted3, copy elision seems to be more complicated. As far as the function is concerned, the argument is an lvalue, so unless the sorted function knows that the argument passed to the sorted3 function is an rvalue, it can’t perform copy elision. Or have I misunderstood? If this is the case, then it must be capable of interprocedural optimization, right? Why can’t it elide when the function parameter is returned then?

Tony Van Eerd says:

July 19, 2011 at 4:32 pm

Note also the possible semantic difference between pass by value and pass by reference: For example:

struct B { int i; };
struct D : B { int j; };

void by_ref(B const & b)
{
}
void by_value(B b)
{
}

by_ref() takes “anything thay isa B” ie anything derived from B, whereas by_value() takes a B, and B only.

… or so it seems. Actually, assuming a B copy-constructor of the form B::B(B const &), then by_value(d) still works, via implicit by_value(B::B(d)).

B‘s copy constructor would need to be explicit to avoid this.

Not sure how big of a deal that is, but passing by value (while using explicit constructors) could prevent slicing in some situations.

(In fact, in general, the “slicing” of a Derived when constructing a Base from a Derived might be surprising in some situations. ie I suspect many people don’t think of their copy-constructor being used as a slicer. Or being ‘polymorphic’ in some way. Interesting…)

Tal Agmon says:

July 4, 2011 at 11:31 am

Except for enums, every value with a name is an lvalue Did you mean rvalue?

kaalus says:

February 22, 2011 at 9:13 am

Forgive me the rant. The article is very good, and I really admire people like Dave Abrahams, who keep in touch with all this, despite the complexity.

February 22, 2011 at 9:06 am

It seems to me that C++ has got itself into such a blind alley.

After reading the article, step back and have a look: such a basic thing, passing/returning values from functions. Yet it is so complex, full of traps. It takes many pages to explain, and it requires the reader to have many years of C++ experience to be really able to grasp the explanations, and apply them succesfully.

Can anyone be expected to write reliable software, solving complex problems, if you need 10 years of experience to just reliably return a value from a function, without shooting yourself in the foot? But this article shows only a tip of the iceberg, really. What a complete horror this becomes when you factor in variadic templates, template specializations, overloading rules, overriding rules, lambda expressions, SFINAE rules, type promotions. With this on, you can pretty much never be sure your code is correct, let alone optimal.

C++ has become clever, way too clever for an ordinary programmer to use effectively.

It no longer looks good even on toy problems – too many caveats.

As a result, average programmer uses C++ in a shoddy way, resulting in buggy, suboptimal code. This is what 95% C++ programmers in the wild are doing, from my experience.

The remaining 5%, who have OCD and are really determined to do all things right without cutting corners, end up agonizing for hours on every trivial function definition, all in the spirit of the above article. Of course they get nowhere for weeks.

C++ is a tool. Tools are for making people’s lives easier. C++ doesn’t anymore. It creates more problems than it solves.

peterchen says:

February 24, 2011 at 6:53 am

Your rant is definitely justified. My viewpoint differs in some aspects – so here’s mine:

In defense of C++: Never underestimate “simple”. Once you look deeper into it, it often gets terribly complicated – or in other words, it’s amazing on what a tower of shoulders we stand. To blame the same on other languages: garbage collection is simple – unless you look into it. Who would have thought cleaning up objects would be so terribly complicated?

Picking the wrong way to pass a parameter is rarely shooting yourself in the foot. going by the “simpler rules” (pass by reference to avoid copies) will virtually never be wrong. Even indiscriminately passing by value will be good enough most of the time.

All code of given complexity is shoddy, incomplete and questionable. The quesiton is: is it good enough for its purpose? (Our biggest problem here: change of purpose.) Knowing what corners you can cut and which you cannot makes you a good programmer. The “obsessive about everything guy” you describe is not (and I recognize myself in your OCD description).

Where I agree:

You have discovered the conundrum of choice: we pick C++ because it gives us choice – in the context of the example, we can pass by value or reference or pointer, or we might sit behind a template parameter and even not know how we pass. However, that choice is also what makes C++ hard, twice the choices isn’t always twice as good.

I am unhappy with the RValue references, because they increase the tax on a typical class without default copy implementaiton. OTOH, they do solve a problem of C++ while preserving the choices enabled by other features.

Number of features in a language is a fundamental problem for languages. Adding lambdas and templates and exceptions and rvalue references to the language make my job easier, because I can use them where appropriate. It also makes my job harder, because to understand your code, I might have to learn lambdas whether I like them or not. (And whether I would have used them or not. I think this dispartiy is a great source of rewrites: Great functionality, but they use exceptions, we use error codes. Cool solution, but I really hate template meta programming. etc.)

There is no universally perfect spot. C++ will see less use in large systems and desktop development. C++ will see more use in small microprocessor and embedded systems, because hardware and compilers are catching up just now. There’s still room for C++, and due to its variety, a lot of room.

C++ is a toolbox, not a single tool.

peterchenQuote
- Robert Ramey says:
  
  January 27, 2012 at 5:36 pm
  
  This is a great comment. It well captures the ying/yang of C++ development.
  
  I think the solution is building applications in layers of abstraction. This can permit all/most of the issues related to low level optimization to be addressed in lower layers while upper layers and ignore most of this. This is why I’m a fan of C++. Unfortunately, I think a lot of programing shops miss and just start coding rather than building applications layer by layer. Using C++ in this way causes lots of frustration and confusion.
  
  C++ is a toolbox for making other toolboxes.
  
  Robert Ramey
  
  Robert RameyQuote
AK says:

January 29, 2014 (1 week ago) at 2:17 am

Yup. I’ve been doing C++ for many years but the companies I worked in used very old version of the compilers that haven’t supported C++11. We’ve only switched a short while ago. Having to read and digest articles like this on something so simple is rather nuts. I, as most of us will, work it out and start to get used to it. However such a common task becoming so complicated is just insane.

AKQuote

x.martian says:

December 17, 2010 at 4:19 pm

This article was copied verbatim by someone in his blog:

prasanthmadhavan dot wordpress dot com @ /2010/11/26/the-r-value/

Dave Abrahams says:

December 17, 2010 at 6:48 pm

Amazingly, without attribution. How did you happen to find it?!

Dave AbrahamsQuote
- x.martian says:
  
  December 17, 2010 at 7:03 pm
  
  I stumbled on that doing research to convince myself that sorted() is better a style than sorted2(), or rather that you’re incorrect and it should be other way around.
  
  x.martianQuote
  - nick stokes says:
    
    January 5, 2011 at 5:31 pm
    
    I couldn’t realize at first what this was about, in case anyone was wondering Google has cached it at bit.ly/fJgRBF. Pillaging at its best, unbelievable!!
    
    nick stokesQuote

December 15, 2010 at 6:25 pm

I’m not sure that I agree with you that

std::vector
sorted2(std::vector const& names) // names passed by reference
{
    std::vector r(names); // and explicitly copied
    std::sort(r);
    return r;
}

be in general be less efficient, even with copy elision, than:

std::vector
sorted(std::vector names)
{
    std::sort(names);
    return names;
}

It seems to be that in either cases, at least one copy must be made. With return value elision, sorted2 requires no more than one copy either.

Did I miss anything?

Dave Abrahams says:

December 15, 2010 at 6:35 pm
Yes. I’m not sure what, but you did.

When the argument is an rvalue, the compiler is allowed to call the 2nd one with no copying. Now, the fact is that in practice, because of the way compilers implement function calls, it requires a copy today, but with move semantics no copy is needed. To avoid the copy today, you can do something like this:
```
std::vector
sorted(std::vector<std::string> names)
{
    std::sort(names);
    std::vector<std::string> names_;
    swap(names,names_);
    return names_;
}
```
Dave AbrahamsQuote
- x.martian says:
  
  December 15, 2010 at 7:09 pm
  the compiler is allowed to call the 2nd one with no copying.
  
  I can’t see why that is possible. In your example of
```
// get_names() is an rvalue expression; we can omit the copy!
std::vector sorted_names2 = sorted( get_names() );
```
  The object returned by get_names() is a temp object which will be out of the scope when sorted returns. One may argue that in principle, with RVO, sorted() receives the return value object from the caller, which in turn can pass it to get_names(). But such a scheme seems to violate the standard which requires all argument be evaluated BEFORE entering the callee.
  x.martianQuote
  - Dave Abrahams says:
    
    December 15, 2010 at 10:48 pm
    
    The temporary’s lifetime lasts to the end of the full expression. That includes the assignment.
    
    Dave AbrahamsQuote
    - x.martian says:
      
      December 16, 2010 at 4:36 pm
      
      The temporary’s lifetime lasts to the end of the full expression.That includes the assignment.
      
      But that’s not good enough. you are going to use sorted_names2 after the statement, don’t you? You need to preserve the values by copying them to sorted_names2 right? My argument is that you’ll have to make at least one copy of the values, either by using copy constructor, or by assignment. Hence, there’s no advantage of using 1.
      
      x.martianQuote
    - Dave Abrahams says:
      
      December 17, 2010 at 7:56 am
      
      Once you elide the copy, the source of the copy no longer gets destroyed; the lifetime becomes that of the thing it was “copied” into.
      
      Dave AbrahamsQuote
    - x.martian says:
      
      December 17, 2010 at 3:05 pm
      
      Once you elide the copy, the source of the copy no longer gets destroyed; the lifetime becomes that of the thing it was “copied” into.
      
      How would that be possible? The “source” of the copy is allocated in the stack and is popped out upon the return of sorted function.
      
      x.martianQuote
    - Dave Abrahams says:
      
      December 20, 2010 at 11:26 am
      
      As noted in the article itself, the compiler allocates space for the return value outside the stack frame of the function doing the returning. It constructs the source there.
      
      Dave AbrahamsQuote
- peterchen says:
  
  December 16, 2010 at 9:16 am
  To see if I understand your point correctly:
  
  You say that with a good compiler,
```
vector getsomevector() { ... }
vector x = sorted(getsomevector());
```
  there’s no copy for the sort, since the gestomevector temporary() is moved into a local, where the sort happens, which in turn is moved into x?
  peterchenQuote
  - Dave Abrahams says:
    
    December 16, 2010 at 9:33 am
    
    No, I’m saying a good C++03 compiler will neither incur a copy to pass an rvalue into the function, nor will it, in most cases, incur a copy to pass a return value out of a function. There are a few cases where the realities of calling conventions make it impractical to suppress the copy upon return, and this is one of them (think about how the optimization must be implemented and you’ll see why), which explains the need for swap() in my example on real compilers, even good ones.
    
    Dave AbrahamsQuote
    - x.martian says:
      
      December 16, 2010 at 4:48 pm
      
      So now you need to do a swap, which creates more chance for the compiler to goof up.
      
      Why not use this good old
      
      sorted_names sorted2(std::vector& names);
      
      signature. The only thing I need to pray for is a RVO.
      
      x.martianQuote
    - Dave Abrahams says:
      
      December 17, 2010 at 7:51 am
      
      Why don’t you try that with an rvalue argument and find out?
      
      Dave AbrahamsQuote
    - Balakrishnan B says:
      
      January 26, 2011 at 9:57 am
      
      I understood the vector example where an implicit copy is better than explicit copy. But Im not sure what to do in this situation, mystring f(); const mystring s1 = f(); const mystring& s2 = f();
      
      If I want a const mystring which one of the above to use or both equivalent? I created a custom class and observed that both the statements makes equal number of copy constructor calls. Either 0 copy or 1 copy depending upon whether the return value is known at compile time or not. Many of my friends are say catching the return value by reference is efficient. After reading this article I feel that either the first one is better or both are equivalent. Am I right?
      
      Balakrishnan BQuote
- x.martian says:
  
  December 17, 2010 at 12:51 pm
  Why don’t you try that with an rvalue argument and find out?
  
  Yes, indeed. I tried the following with Microsoft VC++10:
```
#include
#include
#include

#define SIZE 1000

class Big
{
public:
    Big() {
        std::cout << "Big constructor" << std::endl;
        //res = malloc(SIZE);
    }

    ~Big() {
        std::cout << "Big destructor" << std::endl;
        //free(res);
    }

    static Big Build() {
        Big big;
        memset(big.res, 0, SIZE);
        std::cout << "first " << &big << std::endl;
        return big;
    }

    static Big Revise(Big value) {
        std::cout << "second " << &value << std::endl;
        memset(value.res, 1, SIZE);
        return value;
    }

    static Big Revise1(Big const& ref) {
        Big value(ref);
        std::cout << "second " << &value << std::endl;
        memset(value.res, 1, SIZE);
        return value;
    }

private:
    char res[SIZE];
};


int _tmain(int argc, _TCHAR* argv[])
{
    Big b = Big::Revise(Big::Build());
    std::cout << "third " << &b << std::endl;
    std::cout << std::endl;
    Big c = Big::Revise1(Big::Build());
    std::cout << "third1 " << &c << std::endl;
    return 0;
}
```
  And here is the result:
```
Big constructor
first 0022F5F0
second 0022EA24
Big destructor
Big destructor
third 0022F208

Big constructor
first 0022F9D8
second 0022EE20
Big destructor
third1 0022EE20
```
  Clearly, passing by value reduced the number of copying of the object.
  x.martianQuote
  - James Hopkin says:
    
    December 20, 2010 at 7:14 am
    I’m a little confused by your results and your comment. The output appears to show that the compiler managed to elide one of the copies only when passed by reference.
    
    For reference, here’s what I get running the same code on gcc4.5.1:
    
    Big constructor first 0xbf8cace0 second 0xbf8cace0 Big destructor third 0xbf8cb0c8 Big constructor first 0xbf8ca510 second 0xbf8ca8f8 Big destructor third1 0xbf8ca8f8 Big destructor Big destructor
    
    Here passing by value results in only one copy. I’m going to look at the optimisation settings to see if it can do better for either case.
    James HopkinQuote
    - James Hopkin says:
      
      December 20, 2010 at 10:04 am
      
      By the way, I tried adding a move constructor to Big. It is only called if I add an std::move to the return statement of Revise. It seems the compiler is failing to identify the return value of Revise as an opportunity to elide the copy (as allowed by 12.8/34 in the FCD), so neither does it implicitly treat the expression as an rvalue (as required by 12.8/35).
      
      Presumably the problem is with trying to elide the copies at both ends (parameter and return).
      
      James HopkinQuote
    - Dave Abrahams says:
      
      December 20, 2010 at 11:29 am
      
      FWIW, the compiler is supposed to implicitly wrap any by-value returns in std::move(…). If adding std::move(…) around your return value makes any difference, that’s either a (knowingly) partially-implmented feature or a bug.
      
      Dave AbrahamsQuote
    - x.martian says:
      
      December 20, 2010 at 11:55 am
      
      Actually, we also have to make sure the compiler does not inline any of functions.
      
      x.martianQuote
  - Dave Abrahams says:
    
    December 20, 2010 at 11:49 am
    
    There’s already a fairly complete test referenced from this comment
    
    Dave AbrahamsQuote

Vincent says:

November 14, 2010 at 1:30 pm

I am curious about one thing: I tried the example you gave above, and slightly modified the nrvo and urvo tests as follows:

Now g++ 4.5 still elides the copy in the urvo case, but not anymore in the nrvo case. It is not crystal clear to me why it is so…

Dave Abrahams says:

November 14, 2010 at 4:13 pm
The caller allocates room on the stack for the return value. In urvo_source, it can construct a new object there, no matter which branch of the if is taken. Now, that happens to be the case for nrvo_source as well, but in general, if the objects are named, they may need to maintain an identity separate from whatever becomes the return value:
```
X nrvo_source(bool (&bf)())
{
    trace t("nrvo_source"); // trace
    X a, b;
    // use a and b here.
    return bf() ? a : b;
}
```
The most likely explanation is that when the compiler sees that two different named objects can be returned, it simply gives up on elision and assumes it needs to copy. That’s a cheap way to make the optimization in many cases without performing flow analysis.

HTH,
Dave AbrahamsQuote
- Vincent says:
  
  November 16, 2010 at 3:54 pm
  
  From where I stand (i.e. far away from compiler design…), it just seems that, in the case of nrvo_source(), the same kind of analysis could be performed in each branch of the if(), as in the whole function in your original example. In other words, it should be feasible for a compiler to realise that, in my example, it just needs to construct either a or b at the address passed by the caller.
  
  Of course, there is nothing it could do when the function is rewritten as you show, so maybe it’s not worth the trouble.
  
  Anyways, thanks a lot for your reply, and for this site in general: lot of good reading ahead
  
  (I have no idea why my code was not properly highlighted: I did use the tilde fences, though)
  
  VincentQuote

prasoon says:

November 8, 2010 at 3:14 am

Excellent Article. Read it twice .

Cheers!!

Marc says:

October 31, 2010 at 7:24 am

Hello,

I’ve been trying to use value semantics a lot recently. It is an interesting way of coding, but sadly copy elision (at least as it is currently implemented in compilers) is too restrictive in many cases. For instance if I have: struct Wrapper : Base { Wrapper(Base const& b):Base(b){} }; this constructor will always cause b to be copied (or moved, but for this post I am interested in types where move isn’t faster than copy), even when it comes from a temporary. Sometimes jumping through hoops with emplace-like constructors allows to work around this, but not always. Also, if I have a struct NonAggregate { std::array<Type,5> member; }; (it is not an aggregate because I will give it constructors), I can’t find a way to initialize member without copying all the elements (for an aggregate I could do Aggregate obj={{Type(),Type(),…}}; ).

Now all of this would change if compilers analyzed their AST and, for every temporary object that is referenced only once and for which that reference is a copy, they collapsed this branch. Obviously I am missing some rather large details, but I believe something like that is necessary if we really want to move towards more value semantics.

Marc says:

November 13, 2010 at 4:23 am

I just noticed that this is precisely Core issue 1049 (great! Go Jason!) whose priority sadly got lowered ((

MarcQuote

Chubsdad says:

October 29, 2010 at 7:04 am

May be this should be reworded like so. “Unlike lvalues, which can always be used on the left-hand-side of an assignment (if the lvalue is non const)”

Dave Abrahams says:

October 29, 2010 at 5:28 pm

Good point, thanks! Fixing…

Dave AbrahamsQuote

Vijay Mathew says:

October 11, 2010 at 1:59 am

I once wrote a short blog post on Value Semantics: http://blog.vmathew.in/value-semantics

Sean Parent says:

May 27, 2010 at 10:55 am

Reading old articles…

Minor comment – if you define a function move() as:

// poor mans 0x move
template <typename T>
T move(T& x) {
  T result;
  swap(x, result);
  return result;
}

Then you can write:

std::vector<std::string>
sorted(std::vector<std::string> names)
{
    std::sort(names);
    return move(names);
}

Sean

Hendrik Schober says:

August 25, 2009 at 10:55 am

So what’s wrong with the following (note the &)?

std::vector<std::string>& const names = get_names();

SG says:

August 29, 2009 at 7:43 am
I think you meant
```
vector<string> const  names = get_names(); // #1
vector<string> const& names = get_names(); // #2
```
Assuming get_names returns by value (and not a reference) and you’re dealing with a “good” compiler there should not be a difference between the two. In the first case (#1) the compiler can make the function construct its return value directly in that space that will be referred to by the name “names”. In #2 the return value that lives on the stack gets an extension of its life-time and a reference to it is created. A good compiler doesn’t even need to allocate space for the reference but I’m not sure if most compilers are that smart (even though they could be).

If get_names returns a reference it makes a big difference, of course. Is the following code safe?
```
string const& blah = string("123") + "456";
```
Currently, it is safe. It’s not safe C++0x (according to the current draft) because operator+(string&&,char const*) returns an rvalue reference –> You get a dangling reference.

In my opinion, people should not write #2 instead of #1 just because they think they can safe a copy. Many current compilers successfully elide the copy in #1. Also, they should not declare functions that return rvalue references (I really hope operator+(string&&,char const*) and others will be fixed) for “temporary recycling” because it opens up the possibility of dangling references.

Cheers! SG
SGQuote
Dave Abrahams says:

August 29, 2009 at 9:51 pm

Trick question? It’s a syntax error: you can’t cv-qualify the reference itself.

Dave AbrahamsQuote

Niels Dekker says:

August 22, 2009 at 6:59 am

Thanks for the article, Dave!

A very minor remark: your recommended copy-and-swap assignment implementation cannot do a fast assignment to itself. Fast self-assignment could be achieved by adding an extra check to the “canonical” version:

T& T::operator=(T const& x) {

// Self-assignment check:
if (&x == this) return *this;
T tmp(x);
swap(*this, tmp);
return *this;

}

Of course, the speed of self-assignment is rarely relevant. But I find it slightly counter-intuitive, having a self-assignment that might fail! But if a user doesn’t even have enough memory to assign something to itself, she’s probably in deep trouble anyway!

Dave Abrahams says:

August 29, 2009 at 9:57 pm

Niels Dekker: Fast self-assignment could be achieved by adding an extra check to the “canonical” version

…which would penalize the usual case and complicate the code in order to optimize a rare case, which is almost always a bad idea.

Niels Dekker: I find it slightly counter-intuitive, having a self-assignment that might fail!

These self-assignments never have the form x = x anyway (nobody knowingly does a self-assignment except in test suites). They’re almost always x = y cases, where x and y may or may not refer to the same object. That means we’re in code that has to cope with an exception anyhow. There’s really zero advantage in making self-assignment a no-throw operation.

Dave AbrahamsQuote
- peterchen says:
  
  September 23, 2010 at 9:22 am
  
  Penalize? You have 3 trivial ops vs. a heap allocation.
  
  You are right that it increases complexity, and typical scenarios are ‘x=y’.
  
  I also agree that the test doesn’t help much if your assignment is copy-and-swap. Still I’d put them in by almost habit, since many assignment operator implementations require careful analysis to proof self-assignment is correct.
  
  peterchenQuote

Ganeshram Iyer says:

August 21, 2009 at 6:59 am

Great article, but could you also include a conclusion with an example of what we should do and not do (i.e. best practices?). Your article seems more like a discussion than a “good rule to follow”, which makes it difficult to pick out the important parts. But definitely thanks for the article. Keep posting more.

Dave Abrahams says:

August 21, 2009 at 10:23 am

Thanks for the feedback! Speaking generally, I think the popular C++ literature is long on rules and short on insight, and I’m not naturally inclined to boil things down to a prescription, so it’s great to know when the important stuff doesn’t stand out.

The guideline is: Pass by value any arguments that you would otherwise copy explicitly.

But it’s just a guideline; as I explain in this comment, you’ll end up generating bigger code if you often pass lvalues that way, and bigger code, in some circumstances can be slower code—or it can just be unacceptable because of its size. So maybe you can see why I don’t dispense a lot of rules.

Dave AbrahamsQuote
- Mathias Gaunard says:
  
  August 26, 2009 at 8:09 am
  That guideline appears a bit wrong. For example, with that kind of code
```
void vector::push_back(const T& t)
{
    ...
    new(buffer) T(t);
    ...
}
```
  It would be silly to pass t by value.
  
  So I believe it should be something like:
  
  Pass by value any arguments that you would otherwise copy explicitly, and for which you do not need to control the storage of the copy.
  Mathias GaunardQuote
  - Dave Abrahams says:
    
    August 29, 2009 at 9:31 pm
    
    See my reply here.
    
    Dave AbrahamsQuote
- Mathias Gaunard says:
  
  August 27, 2009 at 8:44 pm
  
  Or rather, pass by value any arguments that you would otherwise copy explicitly and which you don’t want to control the storage of the copy of.
  
  Indeed, for something like vector::push_back, passing by value is useless, even though you want to explicitly copy the argument.
  
  Mathias GaunardQuote
  - Dave Abrahams says:
    
    August 27, 2009 at 10:33 pm
    
    It’s not entirely useless if you know something about the type being passed, like how to move-construct it or that it can be emulated cheaply with default construct and swap
    
    Dave AbrahamsQuote

Thomas Petit says:

August 20, 2009 at 2:57 pm

It seems that your new style copy assignment operator don’t mix well with rvalue reference in MSVC10 and in gcc 4.4 :

struct X
{
   X(){}
   X&amp; operator=(X x){return *this;}
   X&amp; operator=(X&amp;&amp; x){return *this;}
};
X foo()
{
   X x;
   return x;
}
int main()
{
   X x;
   x = foo();
   return 0;
}

At line 15 (gcc error) :

In function ‘int main()’:|
error: ambiguous overload for ‘operator=’ in ‘x = foo()’|
note: candidates are: X& X::operator=(X)|
note:                 X& X::operator=(X&&)|

I don’t understand. The copy assignment operator takes an lvalue, the move assignment operator take an rvalue and foo returns an rvalue. Where is the ambiguity ? Is this a bug ?

Dave Abrahams says:

August 20, 2009 at 5:12 pm

Hold your horses, people! All (well, much) will be revealed when we cover rvalue refs in the next installment.

Dave AbrahamsQuote
- Pavel Matsula says:
  
  April 5, 2013 at 10:04 pm
  
  Greetings from Russia! It seems to me that your blog posts “Move it with rvalue references” and “Your next assignment” are somehow broken because I can see neither the article nor the comments. That’s why I can’t get whether one really needs a move constructor at all since an lvalue will get a copy to swap with due to the by-value argument passing and an rvalue will be treated as the argument itself to.. again swap with if the compiler does the copy elision! Am I missing something?
  
  Pavel MatsulaQuote
  - Pavel Matsula says:
    
    April 5, 2013 at 10:41 pm
    
    Oh, sorry, I surely meant there might be no need of a move assignment X& operator=(X&& x); , not a move constructor.
    
    Pavel MatsulaQuote

Andrzej Krzemienski says:

August 20, 2009 at 1:15 pm

Hi, This is a couple of unconnected thoughts that hauted me after reading your article.

(1) The copy-and-swap idiom. It implements a copy in terms of swap. Then, the default swap function is implemented in terms of three copies. It is all fine if you always implement customized swap for your classes, but if you don’t you are risking infinite recursion. Wouldn’t it be safer if the idiom was always accompained by the note saying that you need to implement swap too?

(2) In C++0x we will have lval references, rval references, and values (no-references). The Cartesian product with const/non-const qualifier gives 6 possible ways in which we can define function arguments: 1. fun( YourClass v ); // by value 2. fun( const YourClass v ); // const copy. useless?? 3. fun( YourClass & v ); // output parameter 4. fun( YourClass const& v ); // sort of by value, bot no copying 5. fun( YourClass && v ); // temporary that I can change 6. fun( YourClass const&& v ); // useless – #4 would do

Numbers 2 and 4 are probably useless, but it still leaves us with four. Even if we do not want to talk about rval refs right now it leaves us with three, which is one too much. Having spent a number of years programming in C++ I do not find in strange any more, but if you look at it from a new commer perspective there should be only two: either I will change your object, or not. I think Andrei Alexandrescu pointed that out somewhere in the discussion groups. It could be only #1 (interested in value) and #3 (interested in an object – memory location). The other two (4 and 5) are just for performance tweeks, aren’t they? Well, #5 is also about unique ownership, but half of it’s job is still performance, isn’t it? It is troublesome that you have four choices, and after your article, it is clear that it is not clear which one to choose. We may think taht we optimize, but in fact we inadvertently pessimize. If we have only two options and teh support of copy elision, move semantics, and perhaps something newer and even more powerful, we could just write:

fun( set<vector> data );

and be sure that we never add any slowdown. I am not surewhat point I am trying to make here, but Im pretty sure I want to make some point. The two functions below have different argument type. Trying to pick one when matching overload candidates would be ambiguous, but they are still two types. 1. fun( YourClass v ); 2. fun( YourClass const& v ); Why do we need #2? Because it is sometimes faster. Why do we need #1? Not sure. Because it is sometimes faster? Perhaps we do not need #1 at all? If we discard it we can change the syntax of #2 to

fun( YourClass v );

This is the same as #1 used to be, but since we discarded it there is no ambiguity. If there is some programmer’s knowlege required to perform optimization, shouldn’t it rather be provided via attributes:

fun( YourClass [[copy]] v );

But do we need even that? Is the compiler simply not smarter that us?

(4) Value semantics. It has a great suppot in C++, e.g. in form of a implicitly defined copy constructor/assignment. A couple of things were suggested to the Standards Committe to make it even better. I just wanted to list them here: 1. Implicitly defined comparison operator (that is a logical conjunction of member-wise comparisons). It was mentioned in N2326, but never really proposed 2. The definition of “the same”. In N2479. It has a status “Outstanding issues” – not sure what it means. 3. Not generating copy operations implicitly for classes with non-trivial destructors. Proposed in N2904. No idea what its status is.

Regards.

Dave Abrahams says:

August 20, 2009 at 2:53 pm

Andrzej Krzemienski: Hi, This is a couple of unconnected thoughts that hauted me after reading your article.

Heh, just a couple?

This is a whole article unto itself! Thanks for your contribution; I may have to respond in pieces.

(1)The copy-and-swap idiom. It implements a copy in terms of swap.

An assignment, I think.

Then, the default swap function is implemented in terms of three copies.

A copy and two assignments, but…

It is all fine if you always implement customized swap for your classes, but if you don’t you are risking infinite recursion. Wouldn’t it be safer if the idiom was always accompained by the note saying that you need to implement swap too?

…point taken.

I’ll try to get to the rest of your material soon, but let me say now that a lot of what you’ve written sounds a lot like thoughts I‘ve been having lately. I ask the question this way: what would a language that was designed to support mutable value semantics look like?

Dave AbrahamsQuote
Sebastian Redl says:

August 24, 2009 at 5:15 am

(1) That’s why it’s customary to use the member swap in the copy-and-swap idiom:

T& operator =(T t) { t.swap(*this); return *this; }

No chance of infinite recursion, unless you implemented member swap in the canonical way, and that would be stupid.

(2) Const rvalue references are useless to the programmer and should never appear written in a program. (They can appear through template deduction.) Const by-value arguments are also useless, so as you say, we’re down to 4 variants. Let’s leave rvalue references out for the moment. We deal with by-ref, by-val and by-const-ref. Combining the usual C++ wisdom with this article pretty much leads to these guidelines: - Unconditionally use by-ref is for out or in-out parameters. However, reconsider whether you need out parameters, because returning might be just as efficient. - Unconditionally use by-val for arguments that are cheap to copy (primitives) and by-const-ref for arguments that are not copyable. - This leaves arguments that are copyable, but it is expensive to do so. This article essentially says that you should pass these by-val iff you plan to modify them inside the function, but don’t want the modifications to be visible outside. The downside of this approach is that you leak an implementation detail into the interface: if you have a traditional assignment operator, you should pass the argument by-const-ref, but if you convert it to a copy-and-swap assignment operator, you should change it to by-val.

Sebastian RedlQuote
- Andrzej Krzemienski says:
  
  August 24, 2009 at 7:58 am
  
  Your guidelines are clear and fine. But what I was writng about was more fantasizing how a more perfect language could look like. In fact, one aspect could be achieved only by even more advanced compiler optimization technique. As I have little (i.e. none at all) familiarity with compilers, I may still be really fantasizing, but just consider:
  
  FatCopy fc = prepare(); read1( fc ); read2( fc );
  
  Two read functions do not alter the parameter; we are used to declaring:
  
  void read1( FatCopy const & fc );
  
  This is in order to avoid copying. This, in turn, is because we are used to think that:
  
  void read1( FatCopy fc );
  
  means copying. The way I have been taught C++, one thinks that the above line means passing data by copying, unless copy elision is employed. How about changing our thinking to “passing data by value”: there will be no copying, unless the function really changes value fc and copying is really unavoidable. The compiler will decide to use your copy constructor only if necessary, and in situations where you would need a copy anyways. This could be achieved as follows: The compiler always compiles
  
  void read1( FatCopy fc );
  
  as passing by reference, and if it finds that fc is modified by read1, it marks it “somehow” this fact in that function’s meta data, and later, it adds a copy at the call site while linking, so the calling function would be compiled to:
  
  FatCopy fc = prepare(); FatCopy __copy = fc; read1( __copy ); read2( fc );
  
  The copy is only if read1 modifies fc. Otherwise there is no copy whatsoever.
  
  Regards.
  
  Andrzej KrzemienskiQuote
  - Tony Van Eerd says:
    
    July 19, 2011 at 4:56 pm
    
    I would call this a form of “copy on write”, at the compiler level. If you were looking for a name/idiom.
    
    P.S. sounds like a good idea to me.
    
    Tony Van EerdQuote
    - Dave Abrahams says:
      
      August 17, 2011 at 3:47 pm
      
      Actually, I’ve been talking/thinking about exactly Andrzej’s idea for about a year now, and calling it “compile-time copy-on-write.” I think it’s time for the article about ideas for the “ideal language in the spirit of C++.”
      
      Dave AbrahamsQuote
Dave Abrahams says:

August 29, 2009 at 10:06 pm

Andrzej Krzemienski: The copy-and-swap idiom. It implements a copy in terms of swap. Then, the default swap function is implemented in terms of three copies. It is all fine if you always implement customized swap for your classes, but if you don’t you are risking infinite recursion. Wouldn’t it be safer if the idiom was always accompained by the note saying that you need to implement swap too?

Maybe, but it’s probably not as scary as you think. Outside namespace std, an unqualified call to swap won’t find std::swap unless you’ve explicitly brought it into scope with a using-declaration.

Dave AbrahamsQuote

Brad says:

August 19, 2009 at 6:16 pm

Dave, et al. Thanks much for this site and the material so far.

Dave Abrahams says:

August 19, 2009 at 6:40 pm

Nice to hear from you, Brad! And, you’re most welcome. Thanks for being a part of what has been a very gratifying response so far.

Dave AbrahamsQuote

Maxim Yanchenko says:

August 19, 2009 at 5:49 pm

One consideration about the following:

Consider this cousin of our original sorted(…) function, which takes names by const reference and makes an explicit copy:
std::vector
sorted2(std::vector const&amp; names) // names passed by reference
{
    std::vector r(names);        // and explicitly copied
    std::sort(r);
    return r;
}

In principle, the fact that the function makes a copy is an implementation detail, which is invisible if you use const vector&. When you switch to pass-by-value, you are effectively exposing your implementation to the interface of the function, so in case you later come up with another solution that doesn’t involve copying, you’ll switch to reference with the obvious signature change, that at least will require your clients to recompile (fortunately, their code won’t change unless they are using the signature explicitly, say as a function pointer). Without this, they would just replace .so/.dll with the new version.

It’s not a major problem, just a point to consider. After all, almost no explicit optimization comes at zero price. (Btw, if elimination of reference-bound temporaries was allowed, it would work in this case as well!)

Dave Abrahams says:

August 19, 2009 at 6:37 pm

First, yeah this is just another manifestation of the code size issue explained in this comment.

However #1, I have a hard time seeing a switch to pass-by-value as exposing an implementation detail. I’m not sure why; it may just be a gut reaction, but my orientation is toward pass-by-value as a default, and pass-by-reference as an optimization. Nothing obliges the function to modify or steal resources from the copied value, and the copy ought to be side-effect-free. Okay, I’ll admit I’m flailing about in the dark hoping to hit the right explanation for why it’s not an implementation detail. Let me give that some thought.

However #2, I totally disagree that legalizing your optimization—which I won’t even call “elimination of reference-bound temporaries” because you surely would not want that to be allowed in all cases—would help with the recompilation problem. It’s like I said earlier: your optimization depends on being able to see inside the called function, which is in conflict with the separate compilation model.

Dave AbrahamsQuote
- Andrzej Krzemienski says:
  
  August 20, 2009 at 10:55 am
  
  Hi, Your “However #1″ is somehow very inspiring. When I type the function:
  
  int double1( int const& i ) { return 2*i; }
  
  and then change it to:
  
  int double2( int i ) { return 2*i; }
  
  No-one will say taht I exposed any imlementation detail. I just want a value. In fact I would probably never write double1, because double2 is so natural: “just give me this integral value”. double1, on the other hand says: “I will take it by reference”. Imagine a function call:
  
  return double2(5);
  
  Who is interested in knowing that you will have yet another name to refer to the 5? I just want to double it. Now, ‘const’ means “I will not try to change it, so pas me literals and temporaries as well”. This is “weird” too. I wasn’t asking whether you would be changing it or not, I just wanted to double the number, but now I have to puzzle about whether someone will mutate my value or not. And in fact “int const&” somehow isn’t as much mutation-proof as “int” alone. You can cast away const, but you cannot cast the copy back onto the original.
  
  Also the change from double1 to double2 only exposes a copy operation (and the destructor). Not any other function. Copy operation is special to that extent that compilers implement it for you for your own classes. Function:
  
  void fun( YourClass cc );
  
  Doesn’t expose any function of YourClass. It simply requires a value. Well, I know it is just some loose thoughts.
  
  Regards.
  
  Andrzej KrzemienskiQuote
- x.martian says:
  
  December 15, 2010 at 5:58 pm
  
  I have a hard time seeing a switch to pass-by-value as exposing an implementation detail.I’m not sure why;
  
  Here is why: If you header file looks like:
  
  class B; class A {public: void Foo(B& b);};
  
  The user does not need to know the definition of B. On the other hand, if your header file looks like:
  
  include “B.h”
  
  class A {public void Bar(B b);};
  
  Now the user of your class is forced to know B.
  
  x.martianQuote

Sebastian says:

August 17, 2009 at 11:29 am

It seems you’re going to “blogify” the “RValue Reference 101″ article which is nice because it is currently inaccessible to users who don’t have a trac account at boostpro.

Cheers! Sebastian

Dave Abrahams says:

August 17, 2009 at 12:23 pm

It started with that article, but it’s being extensively revised and expanded.

Dave AbrahamsQuote

August 17, 2009 at 10:33 am

Hi Dave,

Could you please comment on this note in the Standard (12.8/15) “when a temporary class object that has not been bound to a reference (12.2) …” Why there is this “has not been bound to a reference” constraint? Consider this (no idea how to get syntax-highlighted code here, a “how to” link would be great):

Due to the constraint, holder_1::s can’t be initialized directly from “some string”, and a temporary string will always be created.

And this constraint appears to be in the latest draft as well.

Thanks

Dave Abrahams says:

August 17, 2009 at 12:21 pm

Maxim Yanchenko: Consider this (no idea how to get syntax-highlighted code here, a “how to” link would be great):

Please see the new “posting” tab.

struct string
{
    const char* str;
    string(const char* str) : str(str) { puts("constructor"); }
    string(const string&amp; other) : str(other.str) { puts("copy constructor"); }
};
struct holder_1 { string s; holder_1(const string&amp; s) : s(s) {} };
int main()
{
    holder_1("some string");
}

I don’t think I agree with your analysis of this example. The type of "some string" is char[12], and it must be converted to a string in order to match the signature of holder_1‘s ctor in line 7 long before holder_1::s is initialized. There’s no opportunity to use the ctor in line 4 as far as I can tell. Am I missing something?

Maxim Yanchenko says:

August 17, 2009 at 4:03 pm
Well, to me, copyctor elision is (conceptually) something like “get rid of creating a temporary if it’s being used only to initialize an object of the same type” (please correct me right here if I’m wrong)? and the wording in the standard is, well, just wording, which is subject to change (e.g. there are two more allowed cases in the latest C++0x draft comparing to 2003). To do the elision, compiler should look ahead anyway? to check the usage of the temporary. E.g. here:
string s = "some string";
The sequence of calls should be string(const char*)->string(const string&), but the compiler “looks a bit further and sees” that the copyctor initializes from a temporary, and gets rid of it (of course, making all necessary checks about copyctor availability).
I don’t see why it can’t be done here as well, as the compiler has all code, everything is inline etc. It’s an optimization, and the compiler should be clever enough to look ahead (and we know they are in many cases, like link-time optimization). But here it’s simply forbidden by the standard because the temporary has been bound to a reference. That’s why I’m asking why do we have this constraint.
Maxim YanchenkoQuote
- Dave Abrahams says:
  
  August 17, 2009 at 6:29 pm
  Maxim Yanchenko: It’s an optimization, and the compiler should be clever enough to look ahead (and we know they are in many cases, like link-time optimization). But here it’s simply forbidden by the standard because the temporary has been bound to a
  
  OK, I understand what you’re asking, though I’m not at all sure that eliminating the text in question would be enough to allow that particular optimization. For what it’s worth, your idea is in a completely different class from today’s copy elision, because the current optimizations only “look a bit further” at the call site, and don’t require looking inside the callee as yours does, which in principle is possible though it may be too late by link time, practically speaking (too much high-level information missing by then).
  
  I don’t really know why we have that constraint (“see D&E” is my stock answer), but if I had to guess I’d say it was there to prevent lvalues from being mutated in scenarios like
  X produce(); int consume(X); X& a = produce(); int b = consume(a);
  If line 5 mutated a that would be pretty surprising.
  Dave AbrahamsQuote
  - Rodrigo says:
    
    August 17, 2009 at 6:54 pm
    
    I read somewhere that it is because the committee agreed (for C++98) that some use-cases exist where monitoring the number of copies is useful. I’m completely opposed to this; the programmer that needs to monitor the number of copies created also needs to know the strange cases where the copy is allowed to be avoided.
    
    RodrigoQuote
  - Dave Abrahams says:
    
    August 17, 2009 at 10:23 pm
    
    Rodrigo: I read somewhere that it is because the committee agreed (for C++98) that some use-cases exist where monitoring the number of copies is useful.
    
    I would be very surprised if the particular optimization Maxim was asking for was ever considered, and even more surprised if it were ruled out on those grounds. That seems completely inconsistent with the intent of copy elision.
    
    Dave AbrahamsQuote
  - Rodrigo says:
    
    August 18, 2009 at 12:34 pm
    
    Yes sorry Dave, I misread it.
    
    However, Alex Stepanov in his notes and his latest book smartly points that if we could force construction, copying and equality to retain their expected semantics, the compiler could apply that optimization.
    
    The committee, prefering freedom, didn’t enforce any semantics. It means that
    
    struct s { s(const char*); s(const s&); };
    
    bool operator==(const s&, const s&);
    
    s a1 = “123″; s a2 = a1;
    
    (a1 == a2); // could be false, the programmer could rely on it being false!
    
    then, we simply aren’t allowed to rewrite s a2 = a1; to s a2 = “123″;
    
    RodrigoQuote
  - Dave Abrahams says:
    
    August 18, 2009 at 12:54 pm
    
    I think you mean “the programmer couldn’t rely on it being true!” In any case, yes, Elements of Programming and regular types will definitely be topics in upcoming articles.
    
    Dave AbrahamsQuote
  - Maxim Yanchenko says:
    
    August 19, 2009 at 3:23 am
    
    Rodrigo: s a1 = “123″; s a2 = a1;
    
    a1 is not a temporary here, so it doesn’t apply. Here is my conceptual understanding for the copy elision:
    
    “If a temporary object a1 is only used to initialize another object a2 of the same type, a1 can be eliminated and a2 can be directly initialized using a1′s initializer”
    
    In your example a1 is also used later in comparison (and it’s not a temporary object at all), so it can’t be eliminated.
    
    Rodrigo, Dave, do you agree with this conceptual definition for the copy elision?
    
    Maxim YanchenkoQuote
  - Rodrigo says:
    
    August 19, 2009 at 6:54 am
    
    In any case yes.
    
    One thing I hate from C++ is the copy-ellision rules. It would be easier if the problem you point were surely caused by a weak compiler instead of a C++ rule.
    
    While I’m not completely sure if your optimization is currently allowed in C++, for me it makes sense.
    
    Btw my point was that a copy from an object could be different from an object created using the same initializer from the source.
    
    RodrigoQuote
  - Seth says:
    
    June 12, 2011 at 6:10 pm
    
    Well said, Dave! What they probably mean is that they accidentally do URVO in debug mode. I wonder what happens if someone reported THAT as a bug
    
    SethQuote
  - Maxim Yanchenko says:
    
    August 17, 2009 at 9:47 pm
    
    Dave Abrahams:it may be too late by link time, practically speaking (too much high-level information missing by then).
    
    Well, I mentioned link time just to emphasize the power of present day optimizers. For the code in question, all analysis can be done in compile time. It’s optimization, which is never mandatory, so we should be ready for the cases when it doesn’t work (e.g. it falls to link time).
    
    Dave Abrahams: If line 5 mutated a that would be pretty surprising.
    
    It looks like you have meant const X& a = produce();, right? Something strange with the markup.
    
    Well (all the latter is not strictly according to the standard definitions), a is not a “real temporary” here: the object returned by produce is, but when you bind it to a reference that extends its lifetime, it’s not a temporary anymore in terms of elision, as you explicitly say “I want this object to live beyond the full expression it was created in”. OTOH, it applies to all other references as well (e.g. references in parameters), so probably they should be also subject to elision.
    
    As long as a is used (and only used) to initialize a temporary (I believe this is the necessary precondition for the elision) in the consume‘s parameter, I see no problem with the a optimized out.
    
    But this should probably also mean that we don’t rely on the destructor of X (obvious usage is the Guard idiom).
    
    I need to think more about it, I still don’t have clear picture.
    
    Maxim YanchenkoQuote
  - litb says:
    
    September 1, 2009 at 2:52 pm
    I think there is another version of that. Consider:
    
    void f() { try { prolly_failing(); } catch(Exception &e) { Exception e1(e); e1.modify(); // oops, e is modified potentially } }
    
    Since exception objects are temporaries too, the restriction about reference binding makes the irritating case of above not possible.
    litbQuote
  - litb says:
    
    September 2, 2009 at 6:36 am
    Hi there. A similar example where we would mutate a temporary that has got a name:
    
    void f() { try { dangerous(); } catch(Exception &e) { // (seemingly) copy it then (seemingly) modify // only the copy Exception copy(e); copy.enjoy(); throw; } }
    
    We would accidentally modify the exception object. I think the restriction of the reference binding will generally make it so we can’t refer to the temporary object a second time.
    litbQuote

unomadh says:

May 27, 2012 at 8:28 am
Hello Dave, Sweet arcticle, brought me in understanding stratums about so many things, thanks a lot.

A question now, given this piece of code:
```
struct T
{
    T()                         { cout << "+ " << this << endl; }
    T(T && f)                   { cout << "+ " << this << " move " << &f << endl; }
    T(T const& f)               =delete;//{ cout << "+ " << this << " copy " << &f << endl; }
};
T func (T f) {
    return T {};
}
int main () {
    T t2 = func(T {});
}
```
Output:
```
+ 0x7fffb4e139a0
+ 0x7fffb4e13980
```
Meaning no copy due to elision, and even no move. This code compiles (GCC 4.7) event if copy constructor is explicitly deleted, but isn't required in theory ? Is it a standard behaviour ? On the other it doesn't compile with deleted move constructor, even if not used just like copy.

Thanks
unomadhQuote
- Marc says:
  
  June 4, 2012 at 8:41 pm
  
  Meaning no copy due to elision, and even no move. This code compiles (GCC 4.7) event if copy constructor is explicitly deleted, but isn’t required in theory ? Is it a standard behaviour ? On the other it doesn’t compile with deleted move constructor, even if not used just like copy.
  
  Note that you can disable elision with -fno-elide-constructors to see the difference. There is no copying in your example, only moves. Where do you expect you might need a copy construction?
  
  MarcQuote
  - unomadh says:
    
    June 4, 2012 at 8:50 pm
    
    Huh that’s right. So the compiler seems to request a copy constructor even with the elision. Is it right ?
    
    unomadhQuote

Joel Falcou says:

August 16, 2009 at 1:37 pm

Dave Abrahams It looks vaguely as though you’re trying to show something about const vs. non-const member functions, but that distinction doesn’t make any difference to copy elision, so maybe I’m misunderstanding.

Well I was trying to have a function like the sort in the article and see what happens when I call it. The memory allocation is here so i have costly things going on when copy ellision is not done.

Basically, is there a 2-3 classes examples that when compile tells : look here ellision, there no ellision.

Dave Abrahams says:

August 16, 2009 at 6:41 pm

Try this example, whipped up at home with loving hands.

Dave AbrahamsQuote
- Joel Falcou says:
  
  August 16, 2009 at 11:52 pm
  
  Congrats to the cook then. This is utterly perfect. Tested with gcc 4.3 I only miss 1 ellision in the “Return rvalue passed by value” case. On 4.1, all ellisions fail. Gotta text MSVC.
  
  Thansk again
  
  Joel FalcouQuote

August 16, 2009 at 8:11 am

Ok. So I wanted to see if I can “checj” this things work with my current g++. So, I wrote that : http://codepad.org/craBkxXL

The output is, for gcc 4.3 : non-const call A new : 4 A delete : 6 A copy : 3

non-const call B new : 4 B delete : 6 B copy : 3

The non-const call is there to “emulate” the sort function from the article.

So, does this means the copy-stuff is done or do I don’t call the proper thing and, hence, not trigger the mecanism or is gcc 4.3 not copy-ellision aware (which I doubt) ?

I think I’m just not doing the correct thing to check this. So how should I butcher this so I can validate the use of copy ellision and show this to unbeliever co-worker ?

Dave Abrahams says:

August 16, 2009 at 12:19 pm

Joel, it’s a little hard to tell what you’re trying to demonstrate with this example. It looks vaguely as though you’re trying to show something about const vs. non-const member functions, but that distinction doesn’t make any difference to copy elision, so maybe I’m misunderstanding. Try cutting it down to the absolute minimum (e.g. remove all that memory allocation stuff) and if you’re testing several different things, separate those tests as well.

Dave AbrahamsQuote
Agnostic says:

November 9, 2012 at 6:27 pm

The behavior depends on compiler options and on way how temporary is objained http://ideone.com/QYYZKl

gcc 4.3 non-const call A new : 5 A delete : 7 A copy : 3

non-const call B new : 3 B delete : 5 B copy : 1

AgnosticQuote

August 16, 2009 at 4:51 am

Very interesting article !

Sure, copy elision really mess with our C++ programmer’s “common sense”. It’s quite surprising to find a better way of writing assignment operator after all these years. There is certainly a lot of textbook to correct.

However, exploiting copy elision further than that, especially for return value, seems a bit fragile to me : 1) It’s hard to check if RVOs really take place, unless adding I/O in constructors. 2) Turn on debug mode and all these nice copy elisions disappear. (at least with MSVC)

“we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!”

Great ! I’m really looking forward to seeing that. There is a lot of resource on move semantic out there, but it’s still very confusing to me. For example, I’ve yet to see a straightforward explanation of what is the correct and efficient way of writing functions like “sorted” in presence of move semantic.

std::vector//&& ?
sorted(std::vector/&& ?/ names)
{
    std::sort(names);
    return /std::move ?/names;
}

Dave Abrahams says:

August 16, 2009 at 11:14 am

Thomas Petit: However, exploiting copy elision further than that, especially for return value, seems a bit fragile to me : 1) It’s hard to check if RVOs really take place, unless adding I/O in constructors.

Any side-effect will do; you can increment a counter.

2) Turn on debug mode and all these nice copy elisions disappear. (at least with MSVC)

I just did a quick test with the MSVC 2010 beta, and for that version you’re half right (elisions of arguments passed by value still happen in debug mode) update: see below . That’s a bit surprising, actually: a copied return value doesn’t help make debugging much easier (especially when argument copies are still getting elided), is likely to be misleading for those people looking for release-mode performance, could actually make debug-mode performance unacceptable even for testing, and it doesn’t take significantly more compile-time resources to do the elision. Once you implement RVO it seems like more work to leave a branch in the compiler where it’s disabled.

It is true that copy elision isn’t guaranteed by the standard, so vendors are free not to implement it, or to turn it off depending on compiler options. But is “fragile” really the right word? It’s not as though any vendor who implements copy elision can afford to break it in their next release. Also, given the lack of standard guarantees, I don’t see why you’d be more worried about exploiting return value elisions than argument copy elisions.

Dave AbrahamsQuote
- Dave Abrahams says:
  
  August 16, 2009 at 11:32 am
  
  Actually, I take it back; you’re a quarter right (sorry)! MSVC implements the URVO in debug mode (it will return an unnamed temporary without copying) but not the NRVO. Silly compiler; I’ve submitted a bug report, FWIW.
  
  Dave AbrahamsQuote
  - Dave Abrahams says:
    
    August 19, 2009 at 7:04 pm
    
    Dave Abrahams: I’ve submitted a bug report, FWIW.
    
    And… it turns out not to be worth much at all: http://tinyurl.com/silly-compiler
    
    Dave AbrahamsQuote

Dean Michael Berris says:

August 16, 2009 at 2:57 am

Thanks for the great article. I however have a slightly deeper question regarding rvalues, move semantics, and copy elision:

How do you ensure that the object passed by value is definitely copied and the copy is not elided by the compiler? Also, how then do you make sure that you really do have a copied object instead of a move-constructed object as a function parameter?

void foo(T t); // how to make sure t is copy constructed, instead of move-constructed?

Dave Abrahams says:

August 16, 2009 at 10:08 am

Dean Michael Berris: How do you ensure that the object passed by value is definitely copied and the copy is not elided by the compiler?

If you want to be sure to avoid copy elision (why would you want to do that?) then you need to pass an lvalue. Copies of lvalue function arguments are never elided.

Also, how then do you make sure that you really do have a copied object instead of a move-constructed object as a function parameter? void foo(T t); // how to make sure t is copy constructed, instead of move-constructed?

Well, we haven’t even touched move construction yet, but as long as you’re bringing it up here, again I wonder why you’d want to do that? The answer is the same: pass an lvalue.

Dave AbrahamsQuote

Edouard Alligand says:

August 16, 2009 at 2:22 am

Very interesting article. I tend to “const &” by default and this article reminds me that this is not as optimal as I thought.

August 15, 2009 at 10:30 pm

I’m intrigued on how we can check, for a given compiler at had, if it’s actually doing copy ellision and generate the proper code when we’re using such idioms. I guess adding I/O in the constructor don’t help as it’ll force the ctor to be called. Do you just check the assmebly output or what ?

Moreover, does this means that we can completely and forever dump the old pass-by-const-reference & copy operator= ?

Dave Abrahams says:

August 15, 2009 at 10:42 pm

“I guess adding I/O in the constructor don’t help as it’ll force the ctor to be called.”

Actually, no! Copy elision is explicitly allowed even if the copy constructor and destructor have side-effects. I/O is a great way to see these effects in action.

“does this means that we can completely and forever dump the old pass-by-const-reference & copy operator= ?”

Actually, yes! Dump it yesterday.

Dave AbrahamsQuote
- Joel Falcou says:
  
  August 15, 2009 at 10:46 pm
  
  Wow. I’ll give it a shot.Be sure i’ll update my C++ lessons to incorporate this
  
  Joel FalcouQuote
  - Dave Abrahams says:
    
    August 17, 2009 at 8:27 am
    
    Joel Falcou: On 4.1, all ellisions fail.
    
    Whoa, seriously?! Huge regression, if so (4.0 elides as expected). Are you sure you don’t have -fno-elide-constructors in the spec file or something?
    
    Dave AbrahamsQuote
    - Joel Falcou says:
      
      August 20, 2009 at 1:48 pm
      
      well i just did g++-4.1 ellide.cpp -o ellide I’ll check with 4.2 and 4.4
      
      Joel FalcouQuote
    - Dave Abrahams says:
      
      August 20, 2009 at 2:54 pm
      
      So did you check the spec file? BTW, elision works with 4.3
      
      Dave AbrahamsQuote
  - Dave Abrahams says:
    
    December 23, 2011 at 1:18 pm
    
    Note: you’ll have to bring the pass-by-reference signature back when you start writing move constructors. The pass-by-value idiom creates an ambiguity otherwise
    
    Dave AbrahamsQuote

dvi says:

August 15, 2009 at 6:31 pm

Could you elaborate further on this paragraph?

“First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality are serious considerations for your application, it could have a real effect.”

I don’t quite follow it. Good article though, thanks!

Dave Abrahams says:

August 15, 2009 at 9:49 pm
@dvi: first, you’re most welcome, and thanks for asking for clarification; it helps to know when I fail to connect.

So here, f takes its argument by value, and does whatever it does… but it doesn’t copy a because (copy elision aside) it already has a copy of whatever was actually passed.
1 2 3 4 5 6 7 8 9 10 11
void f(X a) { … modify(a); … } void g() { X b; f(b); // call f with an lvalue } void h() { X c; f(c); // call f with an lvalue }
In this case, the compiler has to generate calls to X‘s copy constructor in the body of g and h at lines 5 and 10. That’s a total of two calls. Probably not a big deal, but in some embedded applications, for example, there’s limited space available for code.

Now compare with what happens when f takes its argument by reference and copies it:
void f(X const& r) { X a(r); // explicit copy … modify(a); … }
Now there’s just one call to X‘s copy constructor, in the body of f. The exact same definitions of g and h still work, but calling f no longer involves copying at the call site.

Does that help?
Dave AbrahamsQuote
- dvide says:
  
  August 15, 2009 at 9:56 pm
  
  Ah yes that makes it clear. Thanks again
  
  dvideQuote

C++Next